Abstract

The chloroplast is an essential plant organelle responsible for photosynthesis. Gene duplication, relocation, and loss in the chloroplast
genome (cpDNA) are useful for exploring the evolution and phylogeny of plant species. In this study, the complete chloroplast
genome of Paris verticillata was sequenced using the 454 sequencing system and the Sanger sequencing method to trace the evolutionary
pattern in the tribe Parideae of the family Melanthiaceae (Liliales). The circular double-stranded cpDNA of P. verticillata (157,379 bp)
consists of two inverted repeat regions of 28,373 bp each, a large single copy of 82,726 bp, and a small single copy of 17,907 bp.
Gene content and order are generally similar to the previously reported cpDNA sequences within the order Liliales. However, we
found that trnI_CAU was triplicated in P. verticillata. In addition, cemA is suspected to be a pseudogene due to the presence of internal
stop codons created by poly(A) insertion and single small CA repeats. Such changes were not found in previously examined cpDNAs of
the Melanthiaceae or other families of the Liliales, suggesting that such features are unique to the tribe Parideae of Melanthiaceae.
The characteristics of P. verticillata cpDNA will provide useful information for uncovering the evolution within Paris and for further
research on plastid genome evolution and phylogenetic studies in Liliales.
Introduction

The chloroplast of plants is believed to have evolved through 
an endosymbiotic event in which a eukaryotic heterotrophic 
organism became host to a cyanobacterium, with an interac- 
tion between them resulting in chloroplast formation (Douglas 
1998; McFadden 1999). It contains a circular double-stranded 
DNA molecule ranging in length from approximately 100 kb
to over 160 kb (Sugiura 1992). The chloroplast genome
(cpDNA) has a quadripartite structure, which includes a
large single copy (LSC), a small single copy (SSC), and two
inverted repeat (IR) regions. The chloroplast genome contains
genes that are responsible for photosynthesis (Sugiura 1992)
and can be inherited maternally, paternally, or even biparentally
(Corriveau and Coleman 1988; Dong et al. 1992; Birky 1995; 
Yang et al. 2000; Zhang et al. 2003; Hansen et al. 2007; 
McCauley et al. 2007). Generally, the gene content and 
order are highly conserved in cpDNA across plant species. 
Therefore, cpDNA protein-coding sequences are believed to 
provide useful information for exploring phylogenetic relation- 
ships among plant species. For example, Jansen et al. (2007) 



resolved the relationships among major clades of the angio- 
sperms by analyzing 81 genes from the plastid genomes of 64 
species. However, gene loss, relocation, and transformation to 
pseudogenes have also occurred. In parasitic plants such as
those belonging to the genus Cuscuta, genes that encode
photosynthesis proteins (e.g., ndh genes) have been lost or have
become pseudogenes owing to the plants' dependence on a
host plant (McNeal et al. 2007). Lee et al. (2007) reported
gene relocation in cpDNA of Jasminum and Menodora 
(Oleaceae), which was the result of multiple and overlapping 
inversions. In addition to these mutations, gene duplication 
events are another characteristic type of phylogenetically 
informative change in cpDNA. For example, trnF sequences 
were investigated and found to be duplicated in cruciferous 
plants, although only part of the gene was duplicated 
(Schmickl et al. 2007). For these reasons, the number of 
chloroplast genomes of green plants uploaded to NCBI 
has risen to 471, of which 372 are from angiosperms 
(http://www.ncbi.nlm.nih.gov/genomes/GenomesGroup.cgi?taxid=2759&opt=plastid#pageTop,
last accessed May 13, 2014).
Fig. 1. Photographs of Paris verticillata. (A) Young plant, (B) plant with flower, and (C) close-up of flower.



Such increasing resources will allow exploration of evolu- 
tion among plants. Paris verticillata M.Bieb (fig. 1) is a member 
of the tribe Parideae of the family Melanthiaceae (Angiosperm 
Phylogeny Group 2009). The species is widespread in East Asia 
and has been used as folk medicine for asthma and chronic 
bronchitis (Ahn 1998). Recently, a new phenolic amide and 
pyrrolizidine alkaloids extracted from the roots of P. verticillata 
were found to have potential cytotoxicity against human 
tumor cell lines (Lee et al. 2008; Kim et al. 2010). Although 
the medicinal features of P. verticillata have been studied, no 
genomic work on the plant has yet been conducted. We 
therefore analyzed the complete chloroplast genome se- 
quence of P. verticillata and compared the cpDNA features 
with previously reported cpDNA in Liliales (Liu et al. 2012; 
Bodin et al. 2013; Do et al. 2013; Kim JS and Kim J-H 
2013). Furthermore, we examined whether the specific 
changes in cpDNA of P. verticillata are found in other species 
in Liliales to investigate the evolutionary pattern and to provide 
useful molecular information for further research on this po- 
tential medicinal plant. 

Results and Discussion 

Genome Assembly, Features, and Comparisons with 
Other Liliales cpDNAs 

The 96,169 reads (range: 40-680 bp) of P. verticillata cpDNA
generated by the 454 system were assembled against the
reference sequence of Chionographis japonica (Bodin et al.
2013). Of these, 327 reads (0.34%) were assembled to
C. japonica, with an average length of 365 bp, covering
35% of the reference sequence with 97% pairwise identity.
Because of the low proportion of the reference sequence
covered (35%) and the low assembly coverage (1.1×), PCR and
Sanger sequencing were conducted to complete the
cpDNA sequence and to confirm the regions assembled from
454 system reads (supplementary fig. S1, Supplementary
Material online).

The cpDNA sequence of P. verticillata (accession number:
KJ433485) is complete and 157,379 bp in length, of which
82,726 bp form the LSC region, 17,907 bp the SSC
region, and 28,373 bp each IR region (fig. 2).
The AT and GC contents are 62.4% and 37.6%, respectively
(table 1). The cpDNA contains 115 unique genes, comprising
81 protein-coding genes, 30 tRNAs, and 4
rRNAs. Among the protein-coding genes, nine contain
one intron and three possess two introns (ycf3, clpP,
and rps12), of which rps12 is trans-spliced. In addition, 25
coding regions are duplicated in the IR regions. However, only
part of ycf1 is duplicated at the junction between the IRB
and SSC regions. Similarly, rps3 at the junction between the
IRA and LSC regions is not functional because of incomplete
duplication (6 bp). The ycf15 and ycf68 genes are pseudogenes,
as indicated by the presence of several internal stop codons
within their coding regions.

The basic features of the reported Liliales cpDNA sequences
were compared (table 1). The results indicated
that the cpDNA of P. verticillata (157,379 bp) is similar in length to
that of Smilax china (157,878 bp) and longer than those of
the other species (range: 152,793-155,510 bp). The AT and GC
contents of the Melanthiaceae species are almost the same
(62.3% and 37.7%, respectively), whereas these features
differ in the other families of Liliales (table 1). The gene
content and order are similar among the species examined,
but rps16 is deleted completely in C. japonica and partially
in Veratrum patulum. In addition, infA is lost in S. china. The
IR borders have expanded to varying extents not only in Melanthiaceae
but also in other taxa (table 1). The IR/SSC boundary is
marked by the incomplete duplication of ycf1 in all species
examined, whereas the IR/LSC junction has expanded to include the full
trnH_GUG (V. patulum), part of rps19 (Alstroemeria aurea and
Fig. 2. Map of the Paris verticillata chloroplast genome. Genes shown outside of the outer circle are transcribed counterclockwise, whereas those
shown inside are transcribed clockwise. The thick lines in small circles indicate the IR regions. The asterisks indicate those genes with introns. # indicates 
triplicated genes. 



Lilium longiflorum), part of rpl22 (S. china), and part of rps3
(C. japonica and P. verticillata). In general, the cpDNA structure
of P. verticillata is similar to those of other Liliales species, such
as V. patulum (Do et al. 2013), C. japonica (Bodin et al. 2013),
A. aurea, L. longiflorum (Kim JS and Kim J-H 2013), and
S. china (Liu et al. 2012) (table 1). The gene content and
order are also similar among the species examined.
However, the loss of infA and rps16 and the partial deletion
of rps16, which are uncommon in the Liliales, could be unique
evolutionary events in the respective species.
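
The region lengths reported in table 1 can be cross-checked against the quadripartite structure, in which the total genome length should equal LSC + SSC + 2 × IR. The following Python sketch applies that check to the six Liliales genomes. The dictionary layout and the function names are illustrative choices made for this note and are not part of the original analysis.

```python
# Region lengths (bp) as reported in table 1: (total, LSC, SSC, one IR copy).
GENOMES = {
    "Paris verticillata": (157_379, 82_726, 17_907, 28_373),
    "Chionographis japonica": (154_646, 81_653, 18_195, 27_399),
    "Veratrum patulum": (153_699, 83_372, 17_607, 26_360),
    "Lilium longiflorum": (152_793, 82_230, 17_523, 26_520),
    "Smilax china": (157_878, 84_608, 18_536, 27_367),
    "Alstroemeria aurea": (155_510, 84_241, 17_867, 26_701),
}


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a genome with one LSC, one SSC, and two IR copies."""
    return lsc + ssc + 2 * ir


def check_genomes(genomes):
    """Return {species: bool}, True where the regions sum to the total."""
    return {
        name: quadripartite_length(lsc, ssc, ir) == total
        for name, (total, lsc, ssc, ir) in genomes.items()
    }


if __name__ == "__main__":
    for name, ok in check_genomes(GENOMES).items():
        print(f"{name}: {'consistent' if ok else 'MISMATCH'}")
```

A check of this kind is mainly useful for catching transcription errors when genome statistics are copied between tables.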
Table 1

Characteristics of Chloroplast Genomes among Liliales

Characteristic              Paris               Chionographis       Veratrum            Lilium              Smilax              Alstroemeria
                            verticillata        japonica            patulum             longiflorum         china               aurea
Family                      Melanthiaceae       Melanthiaceae       Melanthiaceae       Liliaceae           Smilacaceae         Alstroemeriaceae
Accession number            KJ433485            KF951065            KF437397            KC968977            HM536959            KC968976
Protein-coding genes        81                  80                  81                  81                  80                  81
tRNAs                       30                  30                  30                  30                  30                  30
rRNAs                       4                   4                   4                   4                   4                   4
Length (bp)                 157,379             154,646             153,699             152,793             157,878             155,510
LSC (bp)                    82,726              81,653              83,372              82,230              84,608              84,241
SSC (bp)                    17,907              18,195              17,607              17,523              18,536              17,867
IRs (bp, each)              28,373              27,399              26,360              26,520              27,367              26,701
AT content (%)              62.4                62.3                62.3                62.98               62.75               62.74
GC content (%)              37.6                37.7                37.7                37.02               37.25               37.26
IRB-SSC junction            ycf1 (pseudogene)   ycf1 (pseudogene)   ycf1 (pseudogene)   ycf1 (pseudogene)   ycf1 (pseudogene)   ycf1 (pseudogene)
IRA-LSC junction            rps3 (pseudogene)   rps3 (pseudogene)   trnH_GUG            rps19 (pseudogene)  rpl22 (pseudogene)  rps19 (pseudogene)
rpl23-ycf2 IGS length (bp)  591                 303                 305                 308                 308                 308



The extent of IR expansion and contraction is highly
variable among species, not only within Melanthiaceae but
also across the order Liliales: the junctions range
from trnH_GUG to rps3 (table 1). The IR junction of P. verticillata
is in agreement with a previous report (Wang et al.
2008), which suggested that the junction of the order 
Liliales IR/LSC region included the trnH-rps19 cluster. 
However, to date, the IR regions have only been examined 
in four of ten families of Liliales (Bodin et al. 2013; Do et al. 
2013). Therefore, further studies covering the junctions of IR 
and LSC as well as the SSC region in all families of Liliales are 
required. 

trnI_CAU Triplication and cemA Pseudogenization

The length of the intergenic spacer (IGS) between rpl23 and
ycf2, which contains trnI_CAU, varies among P. verticillata
and the other Liliales species (303-591 bp; table 1). Paris verticillata
possesses the longest IGS, which contains three copies of
trnI_CAU (fig. 3A). Such a pattern was not detected in
other complete cpDNAs from Liliales (Liu et al. 2012; Bodin
et al. 2013; Do et al. 2013; Kim JS and Kim J-H 2013). Further
analysis with REPuter (Kurtz et al. 2001) identified
tandem repeat sequences of 139 bp, which include the
74-bp trnI_CAU sequence, in the relevant region of
P. verticillata (fig. 3B) but not in the other genomes (data not
shown).
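
As a rough illustration of what a tandem-repeat search reports, the Python sketch below scans a sequence for a unit that recurs back to back. It is a naive exact-match scan written for this note, not REPuter's algorithm, and the toy sequence, the 9-bp unit, and the function name are all hypothetical; the actual 139-bp unit is not reproduced here.

```python
def find_tandem_repeats(seq, min_unit=2, max_unit=None, min_copies=2):
    """Find exact tandem repeats by brute force.

    Returns (start, unit_length, copies) tuples for every unit length
    between min_unit and max_unit, scanning left to right.
    """
    n = len(seq)
    if max_unit is None:
        max_unit = n // min_copies
    hits = []
    for unit in range(min_unit, max_unit + 1):
        start = 0
        while start + unit * min_copies <= n:
            unit_seq = seq[start:start + unit]
            copies = 1
            # Count how many times the unit repeats immediately after itself.
            while seq[start + copies * unit:start + (copies + 1) * unit] == unit_seq:
                copies += 1
            if copies >= min_copies:
                hits.append((start, unit, copies))
                start += unit * copies  # continue after this run
            else:
                start += 1
    return hits


# Toy input: a 9-bp unit present three times between short flanks.
toy = "TTGCA" + "GATTACAGC" * 3 + "CGTAA"
print(find_tandem_repeats(toy, min_unit=9, max_unit=9))  # [(5, 9, 3)]
```

Dedicated repeat finders also tolerate mismatches and report reverse and palindromic repeats, which this sketch does not attempt.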

Gene duplication in the chloroplast genome occurs mainly 
within the IR regions due to the IR expansion (Goulding et al. 
1996; Xiong et al. 2009). Most duplicated genes are associ- 
ated with tRNAs, which were found in the IGS of rbcL-psaJ of 



Fig. 3. Illustration of trnI_CAU composition in Paris verticillata. (A)
Positions of the trnI_CAU copies (panel labels: 172 bp, 65 bp, 65 bp, and
67 bp; three trnI copies; total length 591 bp). (B) The nucleotide sequence
of the rpl23-ycf2 IGS of P. verticillata, in which three tandem repeat units
are highlighted in different colors. The bold italic characters indicate the
sequences of trnI_CAU.

Jasminum and Menodora of the family Oleaceae (Lee et al. 
2007) and in trnC-rpoB of Ginkgo biloba (Lin et al. 2012). In 
G. biloba, trnC-GCA was duplicated at least twice and evolved 
into a cluster of three tRNA genes: trnY-AUA, trnC-ACA, and 
trnSeC-UCA. Similarly, the duplication of trnF-GAA was also 
detected in Brassicaceae (Schmickl et al. 2007). Although 
gene duplication events were reported previously, the mech- 
anisms underlying these events remain unclear (Lee et al. 
2007; Erixon and Oxelman 2008; Schmickl et al. 2007; Lin 
et al. 2012). Guo et al. (2007) proposed that tandem repeats 
Fig. 4. The alignment of partial cemA sequences among Paris verticillata and related taxa (Chionographis japonica, Veratrum patulum,
Lilium longiflorum, Smilax china, and Alstroemeria aurea). (A) Alignment of partial nucleotide sequences of cemA (positions 1-60). The
poly(A) sequences are underlined. The colored boxes show the SSR. (B) Alignment of partial amino acid sequences of cemA. The asterisks indicate stop
codons. The letters shaded in color show differences in the amino acid compositions of the cemA genes among species. The underlined characters show
amino acid sequences that are similar among P. verticillata and other species.



found in legume chloroplast genomes were generated by homology-facilitated
illegitimate recombination. We suggest
that the trnI_CAU duplication events may be attributable to
homology-facilitated illegitimate recombination, which has
occurred twice to generate three tandem repeats in
P. verticillata.

The cemA-coding sequence of P. verticillata contains a
poly(A) sequence (9 bp) and a small single repeat (SSR) CA
unit (fig. 4A). The other species of Liliales have poly(A) sequences
of variable lengths (8-12 bp) following the start
codon, and C. japonica has the longest such sequence (12
bases). However, the CA SSR was not detected in these
taxa. Although the lengths of the poly(A) sequences are variable,
the amino acid sequences are very similar (fig. 4A and B). In contrast,
the poly(A) sequence and the SSR unit caused a frameshift
mutation in cemA of P. verticillata. Consequently, the cemA
amino acid sequence of P. verticillata differs from those of
other species, except for the first four amino acids (fig. 4A and B).
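
To make the frameshift argument concrete, the Python sketch below translates a short toy reading frame before and after a single extra A is added to its poly(A) tract. The two sequences are invented for illustration and are not the cemA sequences of any species; only the standard genetic code is assumed.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
# Standard genetic code; codons enumerated in TCAG order, '*' marks a stop.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate frame 1 of `dna`, ignoring a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def shared_prefix_length(a, b):
    """Number of leading positions at which two strings agree."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


# Hypothetical frames: start codon, poly(A) tract, downstream codons.
reference = "ATG" + "A" * 9 + "ACATTTGACTTCCCT"
shifted = "ATG" + "A" * 10 + "ACATTTGACTTCCCT"  # one extra A shifts the frame

ref_protein = translate(reference)
shifted_protein = translate(shifted)
print(ref_protein)      # MKKKTFDFP
print(shifted_protein)  # MKKKNI*LP
print(shared_prefix_length(ref_protein, shifted_protein))  # 4
```

In this toy case the two products agree only over the residues encoded before the length change, and the shifted frame runs into an internal stop codon shortly afterward, which is the same kind of evidence used above to flag cemA as a suspected pseudogene.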

In addition to the pseudogenization of hypothetical open
reading frames (ycf15, ycf68) in C. japonica, V. patulum,
A. aurea, L. longiflorum, and S. china, dysfunctional protein-coding
genes have also been reported in many land plants. Lin
et al. (2012) reported the pseudogenization of rpl23 in
G. biloba caused by a truncated 5′-region. The loss of functional
genes has also been found in parasitic plants. For example,
many genes such as atpB, rbcL, ndhF, and rpoC2 were
found to have been lost or to be nonfunctional in Cistanche 
deserticola due to its parasitic lifestyle with its host Haloxylon 
ammodendron (Li et al. 2013). The cemA gene, assumed to 
encode a b-type heme protein of unknown function (Willey 
and Gray 1990), was reported to have been lost in C. deserti- 
cola, Epifagus virginiana, Rhizanthella gardneri, and Neottia 
nidus-avis (Wolfe et al. 1992; Delannoy et al. 2011; 



Logacheva et al. 2011; Li et al. 2013). Different lengths of 
poly(A) sequence have also been observed in other species 
(Yang et al. 2010). Although transcriptome data have been 
analyzed, Yang et al. (2010) concluded that whether cemA 
can be translated into protein remains unclear. The cemA 
sequenced in this study is suspected to be a pseudogene
because of the presence of several stop codons caused by
the poly(A) sequence and the CA SSR at the beginning of
the coding region in P. verticillata (tribe Parideae of the
Melanthiaceae; fig. 4B). However, it is thought to be functional
in other species such as V. patulum (tribe Melanthieae of
Melanthiaceae), C. japonica (tribe Chionographideae of
Melanthiaceae), A. aurea (Alstroemeriaceae), L. longiflorum
(Liliaceae), and S. china (Smilacaceae) because of the absence
of internal stop codons. Consequently, this mutation may 
occur only in the tribe Parideae of Melanthiaceae and could 
be useful for further research in not only Melanthiaceae but 
also Liliales species. The loss of cemA in parasitic plants could 
be explained by the dependence on the host plants. However, 
P. verticillata is autotrophic. Therefore, further research is re- 
quired to clarify the impact of cemA pseudogenization in this 
species. 

Phylogenetic Analysis 

Phylogenetic relationships between species in Liliales and 
other monocots, and dicots, were explored (fig. 5). The results 
showed that Liliales was a monophyletic group; the bootstrap 
value was high (BP 100). Liliales is a sister group to the species
of Asparagales (Oncidium Gower Ramsey, Erycina pusilla,
Cymbidium aloifolium, Phalaenopsis aphrodite subsp. formosana,
and Yucca schidigera), Poales (Anomochloa
marantoidea, Bambusa emeiensis, Dendrocalamus latiflorus,
Fig. 5. Phylogenetic tree inferred by RAxML using nucleotide sequences of 76 protein-encoding regions from 40 species. Bootstrap values (>50) are
shown above the branches. The light green box marks the eudicots, whereas the light gray box marks the monocots. The names
on the right side of the tree give the order-level classification of the species.
and Lolium perenne), Zingiberales (Musa acuminata subsp.
malaccensis), and Arecales (Cocos nucifera and Phoenix dactylifera)
(BP 91). Within Liliales, Melanthiaceae (including
P. verticillata, V. patulum, and C. japonica) is sister to the
Smilacaceae (S. china) and Liliaceae (L. longiflorum). Also,
the Alstroemeriaceae (A. aurea) is sister to the Colchicaceae
(Colchicum autumnale and Gloriosa superba). The familial relationships
defined in this study were identical to those delineated
in a previous work (Kim et al. 2013), in which
relationships among all families in the Liliales were 
investigated. 

Conclusions 

Here, we report the first data on trnI_CAU triplication and
cemA pseudogenization in Melanthiaceae, inferred from the
complete cpDNA sequence of P. verticillata. Notably, these
features were not noted in the previous studies on cpDNA 
of either the Melanthiaceae or the Liliales. Therefore, these 
patterns will be useful for understanding the phylogeny and 
evolution of these species. However, the detailed mechanisms 
and evolutionary impacts of these findings remain unclear, 
and further investigations are therefore required. 

Materials and Methods 

Taxon Sampling, cpDNA Extraction, Sequencing, and 
Assembly 

Paris verticillata was collected in South Korea, and a voucher 
specimen was deposited in the herbarium of Gachon 
University (voucher number: GCU02222). The plant materials 
used in this study were obtained from the Korean National 
Research Resource Center (Medicinal Plants Resources Bank 
NRF-2010-0005790) supported by the Korea Research
Foundation with resources provided by the Ministry of 
Education, Science, and Technology in 2013. Fresh leaves 
(50 g) of P. verticillata were used for chloroplast isolation em- 
ploying the Percoll gradient buffer method (Kim JS and Kim 
J-H 2013). A DNeasy Plant Mini Kit (Qiagen, Seoul, South 
Korea) was used to extract cpDNA from purified chloroplasts. 
The 454 sequencing system (Roche Applied Science, 
Penzberg, Germany) was employed to sequence cpDNA of 
P. verticillata. The raw data were uploaded to Geneious 
Version 6.1, created by Biomatters Ltd. (Auckland, New 
Zealand), to assemble sequencing data. The C. japonica 
cpDNA sequence (Bodin et al. 2013) was chosen as a refer- 
ence to identify gaps in the sequence of P. verticillata. Prior to 
assembly, raw sequences that included ambiguous bases 
("N") were excluded. The remaining sequences were 
mapped using the default settings of Geneious. Only reads 
with more than 90% identity to the reference sequence
were selected to identify gaps in P. verticillata cpDNA. After
assembly, identification of gaps and calculation of assembly 
coverage, specific primers, designed with the aid of Primer3 



(Untergasser et al. 2012), and candidate Liliales primers
(Bodin et al. 2013) were used to fill all gaps, to confirm ambiguous
sequences including low assembly coverage regions,
and to identify the borders of the LSC, SSC, and IR regions
through PCR and Sanger sequencing.
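
The read-selection rules described above (discard reads containing ambiguous bases, then keep mapped reads that exceed 90% identity to the reference) can be sketched in Python as follows. This is not the Geneious implementation; the `MappedRead` record, its field names, and the example reads are hypothetical, and the identity values are assumed to be supplied by whatever mapper is used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MappedRead:
    """A read plus its pairwise identity to the reference (hypothetical record)."""
    name: str
    sequence: str
    identity: float  # fraction in [0, 1], as reported by the mapper


def has_ambiguous_base(sequence):
    """True if the read contains at least one 'N'."""
    return "N" in sequence.upper()


def select_reads(reads, min_identity=0.90):
    """Keep reads with no 'N' and identity strictly above the threshold."""
    return [
        r for r in reads
        if not has_ambiguous_base(r.sequence) and r.identity > min_identity
    ]


reads = [
    MappedRead("r1", "ACGTACGTAC", 0.97),
    MappedRead("r2", "ACGTNCGTAC", 0.99),  # dropped: contains N
    MappedRead("r3", "ACGTACGTAA", 0.85),  # dropped: identity too low
]
print([r.name for r in select_reads(reads)])  # ['r1']
```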

Data Analysis 

The gene content and order were identified with the aid of 
Geneious and adjusted manually. The tRNAscan-SE 
(Schattner et al. 2005) was used to confirm tRNAs. 
Ambiguous bases in coding regions were checked using 
data in NCBI (http://blast.ncbi.nlm.nih.gov/, last accessed 
May 13, 2014) and confirmed by Sanger sequencing. In particular,
specific primer pairs were used to verify the triplication of
trnI_CAU and the pseudogenization of cemA, ycf15, and ycf68.
The cpDNA map of P. verticillata was constructed using
GenomeVx (Conant and Wolfe 2008). The sequences of
cemA were aligned using MUSCLE (Edgar 2004), which is
included in the Geneious program, and manual adjustments
were made when necessary. Sequencher version 5.0 (Gene
Codes Co., Ann Arbor, MI) was used to assemble the complete
sequences of trnI_CAU and cemA. The REPuter program
(Kurtz et al. 2001) was used to detect repeat units in the
rpl23-ycf2 regions.

Phylogenetic Analysis 

Phylogenetic analysis was performed using data on 76 
protein-encoding genes from cpDNA sequences of 40 
species (supplementary table S1, Supplementary Material
online). The nucleotide sequences were aligned using 
CLUSTALW (Hall 1999), which is included in the Geneious 
program. Phylogenetic trees were reconstructed using 
RAxML (Stamatakis et al. 2008), which is available online 
(http://embnet.vital-it.ch/raxml-bb/index.php, last accessed 
May 13, 2014). The GTR+G substitution model was applied to
the entire data matrix. Maximum-likelihood bootstrap analysis
was performed with 100 replicates using the
rapid bootstrapping approach implemented in RAxML. The
phylogenetic tree was drawn using the FigTree v1.3 program.
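
A data matrix built from many genes is commonly assembled by concatenating the per-gene alignments species by species. The Python sketch below shows that step under the assumption that this is how the 76-gene matrix was put together, which the text does not spell out; the function name, the gap-filling behavior for absent genes, and the toy alignments are hypothetical.

```python
def concatenate_alignments(gene_alignments, missing="-"):
    """Join per-gene alignments into one matrix.

    gene_alignments: list of {species: aligned_sequence} dicts, one per gene.
    A species absent from a gene alignment receives a run of `missing`
    characters of that gene's width. Returns {species: concatenated_sequence}.
    """
    species = sorted({sp for aln in gene_alignments for sp in aln})
    matrix = {sp: [] for sp in species}
    for aln in gene_alignments:
        lengths = {len(s) for s in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within one gene alignment must share a length")
        width = lengths.pop()
        for sp in species:
            matrix[sp].append(aln.get(sp, missing * width))
    return {sp: "".join(parts) for sp, parts in matrix.items()}


# Toy example with two genes; 'sp2' lacks the second gene.
genes = [
    {"sp1": "ATGAAA", "sp2": "ATGAAG"},
    {"sp1": "ATGCCC"},
]
print(concatenate_alignments(genes))
# {'sp1': 'ATGAAAATGCCC', 'sp2': 'ATGAAG------'}
```

The resulting dictionary can then be written out in whatever alignment format the tree-building program expects.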

Supplementary Material 

Supplementary figure S1 and table S1 are available at
Genome Biology and Evolution online (http://www.gbe.oxfordjournals.org/).